Fast Flat-Histogram Method for Generalized Spin Models 



S. ReynaQ and H.T.Diep 

Laboratoire de Physique Theorique et Modelisation, 
CNRS-Universite de Cergy-Pontoise (UMR 8089), 
2 avenue A. Chauvin, F-95302 Cergy-Pontoise Cedex, France 
(Dated: February 2, 2008) 

We present a Monte Carlo method that efficiently computes the density of states for spin models 
having any number of interaction per spin. By combining a random-walk in the energy space 
with collective updates controlled by the microcanonical temperature, our method yields dynamic 
exponents close to their ideal random-walk values, reduced equilibrium times, and very low statistical 
error in the density of states. The method can host any density of states estimation scheme, including 
the Wang-Landau algorithm and the transition matrix method. Our approach proves remarkably 
powerful in the numerical study of models governed by long-range interactions, where it is shown to 
reduce the algorithm complexity to that of a short-range model with the same number of spins. We 
apply the method to the q-state Potts chains (3 < q < 12) with power-law decaying interactions in 
their first-order regime; we find that conventional local-update algorithms are outperformed already 
for sizes above a few hundred spins. By considering chains containing up to 2 16 spins, which we 
simulated in fairly reasonable time, we obtain estimates of transition temperatures correct to five- 
figure accuracy. Finally, we propose several efficient schemes aimed at estimating the microcanonical 
temperature. 

PACS numbers: 05.10.Ln, 64.60.Cn, 75.10.Hk 



I. INTRODUCTION 

Long-range spin models have drawn increasing inter- 
est in the last decade, both in the microscopic model- 
ing of a variety of systems ranging from model alloys 
to spin glasses to neural networks and as power- 
ful laboratory frame to investigate fundamental issues in 
the physics of critical phenomena. These include, e.g., 
the effect of dimensionality the crossover from short- 
range to long-range behavior [H, |(| Q , mean- field driven 
phase transitions Q , and possible connections with Tsal- 
lis's non-extensive thermodynamics 0, ITol El- Monte 
Carlo (MC) methods have now gained a prominent role 
in the investigation of phase transitions in these models 
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a major break- 
through was recently initiated by the introduction of a 
(canonical) cluster algorithm able to overcome the algo- 
rithm complexity inherent to long-range (LR) models, 
namely, the need to take a huge number of interactions 
into account at each Monte Carlo step (MCS) [l2]. In a 
recent article, we proposed a generalization of this algo- 
rithm to simulations in the multicanonical ensemble |19| . 
It is the goal of the present work to introduce a general 
and versatile method aimed at embedding any cluster up- 
date scheme in a flat histogram algorithm, with special 
emphasis given to LR spin models. 

Whether short- or long-range interactions are consid- 
ered, canonical MC simulations of long-range spin models 
suffer indeed from severe shortcomings, the use of clus- 
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ter updates notwithstanding. First and foremost, mod- 
els exhibiting first order phase transitions or complicated 
energy landscapes experience supercritical slowing down 
1 2 ("| : the time needed for the dynamics to tunnel through 
free energy barriers grows exponentially with the lattice 
size, leading to quasi ergodicity breaking and unreliable 
statistics. Second, the computation of free energies and 
related thermodynamic quantities is highly involved, and 
a precise determination of the order of the transition is 
often intractable. In practice, these shortcomings pre- 
clude the use of canonical MC algorithms at first-order 
transitions except at modest lattice sizes and in the case 
of weakly first-order transitions. 

An efficient approach aimed at overcoming this limita- 
tion is the simulation in generalized ensembles [2ll |22| , 
in particular its multicanonical flavor initially proposed 
by Berg j^jl |U , reconsidered in the context of tran- 
sition matrix dynamics I25l l26l and recently revisited by 
Wang and Landau [271 |28| . The key- idea here is to arti- 
ficially enhance rare events corresponding to local max- 
ima in the free energy, by feeding the Markov chain with 
an appropriate distribution w(E). In the multicanoni- 
cal ensemble, w(E) is set to the inverse of the density of 
states, so that the resulting dynamics is a random walk 
in the energy space that yields a flat histogram of the 
energy. Other ensembles have been proposed in the last 
decade, including the l/k ensemble, which enhances low- 
energy states [2|| , and very recently, the optimal ensem- 
ble, which aims at optimizing the distribution w(E) with 
respect to the local diffusivity of the random walker, so 
that tunneling times are minimized [3fH l3l| . While still 
broad, histograms engendered by these last ensembles are 
no longer flat; in the optimal ensemble for instance, the 
histogram is slightly peaked around the critical region, 
so that the larger time spent by the random walker in- 
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side the critical region compensates the lower diffusivity 
in this region. 

When implemented through local (i.e., single-spin) up- 
dates |j| , simulations in the multicanonical ensemble suf- 
fer, however, from two serious hurdles. First, while tun- 
neling times — measured in Monte Carlo steps (MCS) — 
are reduced from an exponential to a power law r ~ L z of 
the lattice size, the dynamic exponents z are still substan- 
tially higher than the ideal value z ~ D that should be 
expected from the dynamics of a random walker HH ■ 
This observation, as we will witness in this article, ap- 
plies equally well to effective autocorrelation times and 
to equilibrium times; this represents a serious hindrance 
in terms of scalability, in particular whenever a higher 
precision is desired and large amounts of decorrelated 
data need to be gathered. In this respect, it is important 
to mention that correlations between successive measure- 
ments do not only have an impact on the statistical effi- 
ciency of multicanonical production runs, yet also repre- 
sent a source of systematic error regarding the estimation 
of the density of states |3j( ■ A second impediment to the 
scalability of local-update implementations specifically 
relates to long-range models. Here, the very presence of 
long-range interactions makes the computation of the en- 
ergy — an essential ingredient of multicanonical methods 
— a very time consuming operation, namely, one associ- 
ated with an 0(L 2D ) algorithm complexity. As a result, 
the demand on CPU time needed to generate perfectly 
decorrelated statistics grows as L Z+2D , with z > D. 

In this article, we present a Monte Carlo method which 
successfully tackles these issues by performing simula- 
tions in the multicanonical ensemble using collective up- 
dates. Our methods combines the fast-decorrelating ca- 
pabilities of cluster algorithms with the versatility of flat- 
histogram methods in an efficient and straightforward 
way, and with wide applicability in view. In particu- 
lar, it can be readily combined with any iteration scheme 
dedicated to the estimation of the density of states, e.g., 
Wang-Landau's method [27j or transition matrix algo- 
rithms 25]. Additionally, while our method is presented 
here in the context of long-range spin models, where it 
gives drastic improvements over commonly used meth- 
ods, it is perfectly applicable to any class of models for 
which a cluster algorithm exists in the canonical ensem- 
ble. 

Noteworthy enough, embedding a collective update 
scheme in a multicanonical algorithm is not straightfor- 
ward, however, due to the fundamentally non-local na- 
ture of the multicanonical weight w(E). Indeed, clus- 
ter algorithms depend heavily upon particular symme- 
tries of the model Hamiltonian, which w(E) does not 
keep track of; in particular, there is no longer a canon- 
ical temperature. With simulations of spin models with 
nearest-neighbors interactions in view, several attempts 
have been made at combining cluster updates with mul- 
ticanonical methods in some way or another du ring the 
last decade: the multibond algorithm [S^, HfJ Ha, 
or variants thereof targeting Wang-Landau's algorithm 



|3ll l38| simulate the model in its spin-bond represen- 
tation; Rummukainen's hybrid-like two-step algorithm 
lumps together a microcanonical cluster algorithm and a 
multicanonical daemon refresh [3{j. As opposed to these, 
however, our method relies on a cluster-building pro- 
cess which simply depends on the microcanonical tem- 
perature of the current configuration — a quantity that 
may be readily derived from the estimated density of 
states — in order to determine appropriate bond prob- 
abilities. In particular, it does not require prior knowl- 
edge of the transition temperature, as is the case in the 
multibond method. We further show that our approach 
makes it particularly straightforward to incorporate two 
optimization schemes dedicated to LR models Eoj . 
which cut down the algorithm complexity from 0(L 2D ) 
to 0(L D lnL D ). As a result, the total demand on CPU 
time with respect to uncorrelated data is reduced to ap- 
proximately L 2D \nL D , since cluster updates also lower z 
to around D; where LR models are concerned, the ben- 
efit of cluster updates is thus clearly twofold. Let us 
also mention that, as a by-product, using cluster updates 
provides improved estimators for the statistical moments 
of the order parameter |4lj and for spin-spin correlation 
functions; for instance, the last quantity can be better es- 
timated by counting the fraction of time two given sites 
belong to the same cluster OH ■ Further interesting 
information, including information connected with frac- 
tal geom etry, may also be gleaned from cluster statistics 



Overall, the sharp reduction of the computer load 
brought about by our method allowed us to study q- 
state Potts chains with l/r 1+<T interactions containing up 
to 2 16 spins in a few days on a modern Intel-based com- 
puter. It must be noted that, with standard multicanoni- 
cal methods based on single-spin updates, such huge sizes 
are simply intractable, since the largest size of 2 16 investi- 
gated in this work would demand several months of com- 
putation. As regards dynamic performance, we obtain 
a substantial reduction in the dynamic exponent, from 
e.g., z ~ 1.35(3) to z ~ 1.05(1) for q = 6 and a = 0.7. 
We also show that our method produces faster equilibra- 
tion, lower effective autocorrelation times, and — where 
implementations based on the Wang-Landau algorithm 
are concerned — lower statistical errors in the estimate 
of the density of states, e.g., of nearly an order of mag- 
nitude for q = 6, a = 0.9 and L = 512 spins. As a result, 
we obtain estimates of transition temperatures that have 
a noticeably higher precision than those obtained using 
local updates Q or standard canonical methods [l3l[T^|. 
Finally, in order to check that our method did not pro- 
duce systematic errors, we performed several simulations 
of the two-dimensional seven- and ten-state Potts mod- 
els with nearest-neighbor interactions and sizes up to 
L = 256 x 256. We obtain dynamical exponents close 
to the ideal random- walk value z ~ 2. Although com- 
puted from rather modest statistics, our estimate of the 
interfacial free energy for the largest size reaches a preci- 
sion of four digits. In this respect, our method compares 
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perfectly well with other methods operating in the mul- 
ticanonical ensemble, and represents an alternative way 
for short-range spin models. 

The layout of this article is as follows. In Sec.[Hj after 
briefly reviewing some prominent features of multicanon- 
ical methods, we explain how we combine a multicanon- 
ical weighting with collective updates, with special em- 
phasis given to the detailed balance equation. Section lTTTl 
addresses optimizations dedicated to long-range models. 
Numerical results regarding the dynamic characteristics 
of our method are presented in Sec. IIVI In Sec. ^ we 
compare our results for the two-dimensional Potts model 
with nearest-neighbor interactions, with exactly known 
results, and section is devoted to the investigation of 
the precision of our method in the context of the long- 
range Potts chain with power-law decaying interactions. 
Overall, we pay particular attention to comparison with 
other algorithms operating in the multicanonical ensem- 
ble, especially in terms of tunneling rates, dynamical ex- 
ponents and estimates of thermodynamical averages. Fi- 
nally, we discuss several procedures aimed at estimating 
the microcanonical temperature, and in particular, how 
we can efficiently combine our method with the transition 
matrix approach. 



II. A METHOD TO EMBED CLUSTER 
UPDATES IN A FLAT HISTOGRAM 
ALGORITHM 

Monte Carlo simulations are based on the generation 
of a Markov chain of configurations {crj}, where each 
configuration is assigned a weight w[E(ai)] correspond- 
ing to the probability distribution one wishes to sam- 
ple. In canonical simulations, i.e., carried out at a fixed 
inverse temperature 0, one chooses a Boltzman weight 
w[E(c7i)] = e -P E ( a i) 5 thus thermodynamical averages are 
straightforwardly obtained by computing the appropriate 
moments of the data accumulated at the given temper- 
ature. On the other ha nd, reweighting methods based 
on multihistogramming 46j are hampered at large lat- 
tice sizes by the narrowness of the energy window that is 
sampled, let alone additional supercritical slowing down. 
In the multicanonical ensemble, one allows the dynamics 
to jump across free energy barriers and, from a more gen- 
eral standpoint, to sample wide energy windows, by pro- 
ducing a flat energy distribution over the energy range of 
interest for the problem at hand. This is formally carried 
out by setting w(E) = e~ st - E ^ oc l/n(E), where n(E) is 
the density of states and S(E) is the microcanonical en- 
tropy. This in effect leads to N(E) oc n(E)w(E) = const, 
for the number of visits to energy E. Since the density of 
states is obviously a priori unknown, w(E) is estimated 
using an iterative procedure initially fed from, e.g., a 
canonical guess w(E) = e~^° E at a carefully chosen in- 
verse temperature /?o, a flat guess w(E) = 1, or — when- 
ever feasible — a properly scaled estimate obtained at a 
smaller lattice size. Thermodynamic quantities that de- 



pend solely on the energy, like the specific heat or Binder 
cumulants, can then be computed directly from the es- 
timated density of states. Other quantities, e.g., those 
depending on the order parameter, are obtained through 
a reweighting procedure based on data gathered during 
an additional production run. 

Historically, Berg's recursion scheme 0, was the 
first iteration procedure specifically dedicated to multi- 
canonical simulations. It consists in accumulating his- 
togram entries of the energy during each iteration run, 
and updating w(E) from the histogram of the energy ob- 
tained in a previous iteration run, until eventually the 
histogram becomes flat up to a given tolerance. Entropic 
sampling |49l | more or less boils down to the same key 
principle. Both methods suffer, however, from poor scal- 
ability. Looking at this issue from a slightly different 
angle, the recently proposed Wang-Landau acceleration 
method [2?], updates w(E) in real-time during the 
course of the simulation, performing independent ran- 
dom walks in distinct energy ranges. Since modifying the 
weight of the Markov chain during a simulation breaks 
detailed balance, the amount by which w(E) is modified 
during a given iteration is decreased from one iteration to 
the other until it reaches a negligible value; hence detailed 
balance is approximately restored in the last step of the 
iteration scheme. In this respect, an original approach 
aimed at reducing the statistical error in the density of 
states was recently proposed by Yan and de Pablo |50| . 
whereby the density of states is obtained by integrating 
an instantaneous temperature computed from configura- 
tional information or from a so-called multimicrocanon- 
ical ensemble. Finally, a large class of iteration schemes 
have been proposed that are based on matrices of tran- 
sition probabilities [2^, |2(J |^, |53| or a combination 
thereof with Wang-Landau's algorithm [54ll55j] . Here, the 
density of states is computed through a so-called broad 
histogram equation involving infinite temperature transi- 
tion matrices, where transition matrices keep track of the 
microcanonical average of the number of potential moves 
from one energy levels to another (Sec. I VIII gives more 
details on how our method can efficiently capitalize on 
transition matrices). Historically, procedures based on 
transition matrices were termed flat histogram methods 
in order to distinguish them from Berg's multicanonical 
method, although both approaches in effect yield a flat, 
broad histogram. To sum up, the main benefit of multi- 
canonical methods is twofold: first, a wide energy range 
is sampled, irrespective of the presence of free energy bar- 
riers; second, the methods yield a direct estimate of the 
density of states. 

A local-update implementation of a multicanonical 
algorithm may consist in updating a single spin at a 
time and accepting the attempted move from state a 
to state b with a probability given by W(a — > 6) = 
min[l, e^^"' -5 ^ 6 )]. We now show that the microcanon- 
ical temperature (3(E) defined as dS(E)/dE is a relevant 
quantity for the acceptance rate of this process. Denot- 
ing Eb — E a + e, we expand the probability W(a — > b) 
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for small e, and obtain W(a — > b) ~ min[l, e - ^^ ^ 6 ]. 
This shows that, for small enough energy changes, the 
dynamics is equivalent to that of a canonical simulation 
at an inverse temperature (3(E). Our departure point for 
a collective-update implementation in the multicanonical 
ensemble is thus to build clusters of spins with the same 
bond probabilities as would be given by a canonical sim- 
ulation at inverse temperature 13(E). 

Although our algorithm may be equally well applied 
to other spin models, e.g., models incorporating disor- 
der or exhibiting a continuous symmetry, we now con- 
sider, for the sake of clarity, a generalized ferromag- 
netic spin model with a Z g symmetry, whose Hamilto- 



nian reads H — — J2i<j Jij 
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Here Ja > and 



the Oi variables can take on integer values between 1 
and q. Taking guidance from Swendsen- Wang's cluster 
algorithm 56], we start from an empty bond set, con- 
sider each pair of spins {er^c.,} in turn, and activate a 
bond between them with a bond probability given by 



i(E a ) = 5„ ucrj [1 



] , where E a is the cur- 



rent lattice energy and (3(E a ) the inverse microcanonical 
temperature at energy E a . Efficient ways of estimating 
P(E) are considered later on in Sec. IVIII Then, we iden- 
tify clusters of connected spins using, e.g., a multiple- 
labeling scheme |57^ . draw a new spin value at ran- 
dom for each cluster, and accept the attempted move 
with an acceptance probability Agjp(a — > b) which en- 
sures that detailed balance is satisfied. The derivation 
of this probability may be carried out in the following 
way. First, the total acceptance probability W(a — > b), 
i.e., the quantity that enters detailed balance in such 
a way that e~ s ^W(a -> b) = e - s ^W(b -► a), is 
split into two terms P(a — ► b) and Agjp(a — * 6) repre- 
senting a proposed update probability and an acceptance 
probability for the proposed update, respectively. It is 
straightforward to show that the choice Agjp(a — > 6) = 



P(b^a) S(E„)-S(E h ) 



satisfies the detailed balance 



P(a^6) 1 

equation. Let us denote B the set of active bonds over the 
complete graph Q engendered by all possible interactions: 
the proposed update probability is given by the probability 
to construct a given set B from an empty bond set, i.e., 



P(a -> b) 



{E a 



n [i 

b i} eg\B 



,(Ea) 



„ P(b^a) 
P(a^b) ' 



After simplification, we obtain for 

e J l3 l3(E b ) _ j 



a f}(E„)E b -l3(E a )E a 



n 

b H eB 



whence 



^flip( a -> b ) = min 



ME a ) 



n 



Pij{E b 



(1) 



where a(E) = S(E) - f3{E)E and p lj {E) = e J *i^ - 1. 
This expression can be further simplified if we consider 



long-range models whose coupling constant depends only 
on the distance between spins, i.e., Jy = J (I), where 
I = dist(i, j). We have for ^4fljp(a — ► b): 



Agjp(a — ► b) = min 



1, 



ME a ) 
a(E b ) 



n 

l>0 



\ME b y 


B(iy 


[Pl{Ea)_ 





, (2) 



where B(l) stands for the number of bonds of length I. 
It is worth mentioning that, if one looks at this equation 
from the standpoint of canonical simulations at inverse 
temperature /3q, we have w(E) — e~P° E ; whence j3{E) = 
(3 and a(E) does no longer depend on E. As a result, 
the acceptance rate ^4gjp (a — * b) is equal to 1 and we are 
back to the original Swendsen- Wang algorithm. 

It is also crucial to underline that it is the micro- 
canonical temperature, i.e., the lattice energy in the first 
place, which entirely governs the construction of clus- 
ters; indeed, for a given lattice configuration at energy 
E, bonds are placed as if the model were simulated at 
its microcanonical temperature using a Swendsen- Wang 
algorithm. As a result, cluster bond probabilities change 
continuously as the lattice configuration walks along the 
available energy range of the random walk, so that, e.g., 
small clusters are built in the upper energy range and 
conversely large clusters in the lower energy range. 



III. OPTIMIZATION FOR LONG-RANGE 
MODELS 

A. Computing the lattice energy through FFT 
acceleration 

As is apparent in Eq. JJ, determining the acceptance 
rate of a cluster flip demands that we compute the energy 
of the new (attempted) lattice configuration, which for 
long-range models is an 0(L 2D ) operation. This is sim- 
ilar to the local-update case, where performing one MC 
step, i.e., updating L D spins subsequently, takes a CPU 
time proportional to the square of the number of spins, 
seeing that L D operations are needed after each single 
spin update to compute the new partial energy between 
the updated spin and the rest of the lattice. Recently, 
Krech and Luijten proposed an algorithm that is able 
to compute the energy of a model governed by trans- 
lation invariant interactions in 0(L D \nL D ) operations 
[40| . This method leans on the convolution theorem and 
the Fast Fourier Transform (FFT), for which numerous 
efficient radix-based implementations are available. As a 
result, updating the lattice configuration globally rather 
than a single spin at a time allows us to cut the 0(L 2D ) 
complexity down to an 0(L D \xiL D ) one. A crucial point 
to be noted here is that this reduction is absolutely in- 
tractable with single-spin updates, owing to the very rea- 
son that the energy would have to be computed anew af- 
ter each single-spin update; this requires L D operations, 
and an FFT algorithm would output no gain at all. 



5 



Let us assume that we can write down the model 
Hamiltonian as a sum of dot products, i.e., H = 
~ \ JijS(i) ■ S(j), with Jij invariant by translation. 
This is straightforwardly done when q = 2, since in this 
case the dot product reduces to a product of scalar Ising 
spins. As we will witness in a moment, the presence of 
a delta Kronecker symbol in the Hamiltonian whenever 
q > 2 requires, however, a minor transformation of the 
Hamiltonian. For simplicity, we consider hereafter a one- 
dimensional lattice with an interaction J(l) depending on 
the distance I between spins. The line argument is similar 
in higher dimensions, with the sole exception that multi- 
dimensional Fourier transforms are then performed. The 
Discrete Fourier Transform (DFT) of the spin sequence 
{S(l)}i =1 ... L reads 

l=L-l 

S{k) = S{l)e- i2M ' L , 
and reciprocally, 



k=L-\ 



£(0 = v S{k)e^ k ' 



kl/L 



k=0 



Similarly, we define the DFT of the sequence of coupling 
constants {</(/)} as 



l=L-l 



kl/L 



1=0 

where J p b c {l) incorporates the effect of Infinite Image 
Periodic Boundary Conditions that is, J p &c(0 = 

J(l + mi); for algebraically decaying interac- 
tions, this sum can be exactly expressed in terms of Hur- 
witz functions Q. We diagonalize the original Hamilto- 
nian H by rewriting it in terms of the J(k) and S(k), 



1 



k=L-l 

" £ J(k)S(k) ■ S(-k), 

k=0 



In the case of the three-state model, this transformation 
is equivalent to mapping Potts variables onto the complex 
plane, i.e., a — > — e 427r (°'- 1 )/ 3 i anc l writing the dot 
product • as Re{S^ S^>}. In this case, the 

term S(k) ■ S(—k) becomes ^(fc)! 2 , where S(k) is the 
DFT of the sequence of (complex) variables {S 1 ^}. This 
reduces by one the number of O(L) operations required, 
since computing a dot product is no longer required. 

For q > 3, spin vectors on the (q — 1)— dimensional 
hypersphere may be determined by using hyperspherical 
coordinates in D = q — 1 dimensions, i.e., for the zth 
vector £W (with 1 < i < q), 



CO 


= sin( 


,(<) 


sinl 


9». 


. . sin ( 


1 q-3 


(0 
x 2 


= sin( 




sinl 


?«. 


. . sin f 


,(<) 


T W 

x 3 


= sin( 




sinl 


'1 ■ 


. . COS ' 


flW 
9-3 



sin 



cos 



(«*) 

9-2 

(i) 
9-2 



2^1 2 = sm fl^ cos u 2 



(i) 



(0 

C 9-l 



j(t) 



for 1 < i < q - 2, ef = aj for 



We initially set 

j < i < q and 1 < j < q - 3, and ^ = -9% = 
Uq-2- There remains q — 2 angles ay to be determined 
from q - 2 equations tfW • S^ + ^ = -l/(g - 1) with 
1 < * < 9 — 2, from where we obtain a% = arccos^j-, 
cosa, + i = . c .° s " J , and thus by induction cos a, = -^r. 

J' l+cos Qj ' J J q—j 

After a bit of algebra, we find — (0, . . . , 0, 1), and 



S w = ( 0, . . . , 



q(q-i) 



j-i-i terms 



(«-!)(«-» + 1)' 



r (i) -, ^ 

i x 9 -i-!+i/i<i<»' 1 J 



where it should be emphasized that S(—k) and S'(fc) are 
complex conjugates, since the original vectors S(l) have 
real coordinates. By relying on an FFT radix-2 algo- 
rithm, the task of computing the lattice energy is conse- 
quently reduced to O(LlnL) operations. 

For q > 2, the Kronecker delta symbol in the Hamil- 
tonian unfortunately rules out the previous diagonaliza- 
tion. One way to resolve this issue is to map the g-state 
Potts model onto a (q — 1)— dimensional vector model, so 
that the Kronecker delta function in the original Hamil- 
tonian is turned into a dot product. We define a one-to- 
one mapping between each Potts spin a = 1 . . . q and a 
unit-length vector S 1 -^ belonging to a (q— l)-dimensional 



hypersphere, so that S (a) • 5 ?(<T ' ) = 



9-1 



It is straight- 



forward to prove that 'J2 a = 0, and that 



H = 



— Y 



j( i -j)s^-s^ + -T / j(i-j) 



i<j 



for 1 < i < q, where the (q — 1 — i+j')th coordinate reads 



(0 

C q-l-i+j 



(q-l)(q-l-i + j)(q-i + j) 



and S^^ 1 " 1 differ only in the sign of their first coor- 
dinate x\. Once these vectors have been computed for a 
given q, which may be done on start-up, determining the 
lattice energy requires, first computing the DFT Sj(k) of 
each sequence of coordinates {<S(0"£^}i=i,...,L) an d then 
evaluating the double sum J^kZo ~ ± 2?=i J(k)\Sj(k)\ 2 ■ 
As a result, the whole operation is associated with a 
0(qL\nL) complexity — or in general 0(qL D \nL D ) — , 
provided the implementation relies on a FFT radix algo- 
rithm. As a by-product, it should be noted that once the 
Fourier components have been computed, it is straight- 
forward to derive the Fourier transform of the spin-spin 
correlation functions at any inverse temperature (3 from 
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gfj(k) = 1/1/ {J2j =1 \Sj(k)\ 2 J , where the mean value 

is obtained from a reweighting procedure. At large lat- 
tice sizes, the requirement that L Fourier components be 
stored at each MCS may constitute a significant challenge 
in terms of computer memory; in this case, a practical 
work-around consists in computing microcanonical aver- 
ages for each energy level visited during the simulation, 
and then to perform the reweighting procedure directly 
from these microcanonical averages. In the case of long- 
range interactions, careful attention must be paid, how- 
ever, to the influence of the discretization of the energy 
axis in terms of systematic error. 



B. Efficient cluster construction for long-range 
interactions decaying with the distance 

For long-range spin models, building a new cluster at 
each MCS takes of order L 2D operations, since L D {L D — 
l)/2 pairs of spin are considered in turn for bond activa- 
tion. When interactions decay with distance, the proba- 
bility of adding a bond between two spins falls off quite 
rapidly as the distance between them increases. A sig- 
nificant amount of time during the construction of the 
cluster is thus wasted because an overwhelming number 
of bonds are considered for activation which have only a 
negligible probability to be activated. Even in the case 
of interactions decaying as — with a close to 

0, does the bond count never exceed a few percent of 
the whole number of available bonds. In this respect, 
switching from a local- to a global- update scheme might 
well be an ill-fated choice as the gain in terms of auto- 
correlation time is spoiled by the exceedingly time con- 
suming construction of the cluster. However, an efficient 
construction method was proposed by Luijten and Blotc 
in the recent past [l^. with an efficiency that is inde- 
pendent of the number of interactions per spin, and a 
CPU demand that scales roughly as L D . The rationale 
behind this method is to use cumulative probabilities, 
whereby instead of considering each spin in turn for ad- 
dition to a given cluster, it is the index of the next spin 
to be added which is drawn at random. We now give 
a sketchy outline of the method in the context of long- 
range chains. Extensive details may otherwise be found 
in [l2l l58| . First of all, the probability to add a bond is 
split up into two parts, namely, (i) a provisional proba- 
bility iTi(E) (hereafter simply denoted iri) depending on 
the distance I = \i — j\ between spins and on the lattice 
energy E, and (ii) a factor f(ai,aj) controlled by the 
spin values, e.g., a Kronecker delta symbol in the case of 
a Potts model. If denotes the index of the current spin 
to which we are adding bonds (i.e., spin indices are con- 
sidered to be relative to the current spin), then the pro- 
visional probability of skipping k — 1 spins and bonding 
the current spin with a spin at position k > is given by 
Po(k) — n^nill - 7r m) 7r fe- From there, one builds a table 
of cumulative probabilities Co(ji) = Yjk=i Po{k) for all 



ji > 0, so that the index ji of the spin to be bound with 
current spin is obtained by first drawing a random num- 
ber < r < 1 and then reading out ji from the table, i.e., 
ji is such that Co(ji — 1) < r < Co(ji). Standard binary- 
search algorithms may be used for this purpose. Last, a 
bond is activated between spins and j\ with a probabil- 
ity /(<7oi Ojx)) an d we proceed further with the computa- 
tion of the index j2 > ji of the next spin to be bound with 
current spin 0. The corresponding provisional probabil- 
ity thus becomes Pj 1 (k) — Yim=j 1 +i(^ ~ n m)^k, and the 
cumulative probabilities read Cj 1 (j'2) = Y^k=j 1 +i Pji (^)- 
The same procedure is repeated for {j'3, j'4, . . .} until we 
draw a j a > L, in which case we jump to the next 
current spin, which in a one-dimensional model is the 
nearest-neighbor of the previous current spin. In ad- 
dition, there are two formulas which make it easier to 
compute cumulative probabilities: first, one can show 
that C (j) = 1 - exp[-/3(E)jy k=l J{k)], where E is 
the energy of the current configuration, and second, the 
cumulative probabilities Cj a (j a +i) can be straightfor- 
wardly derived from the Co(j) coefficients through the 
relation C ja (j a+1 ) = Co(j 1 °+^ ) o ~^ ) (i ° ) . It follows from the 
last relation that, instead of building a look-up table for 
each Cj a {j a +i), we may as well draw a random number 
< r < 1, transform it to r' = r[l — Co(j a )] + Co(j a ), 
and choose the next spin to be added from the relation 
CoC/a+i — 1) < r' < Co(j Q +i). In practice, we thus sim- 
ply need to compute a single look-up table filled with 
~Y^ k=1 J{k) for each j at the beginning of the simulation, 
from where we will derive the Co(j') coefficients at each 
new MCS corresponding to a lattice configuration with 
a given energy E. This last task requires of order L D 
operations. To sum up, the construction of each cluster 
thus consists in choosing a " current" spin amongst L — 1 
possible spins in turn, e.g., starting from the leftmost 
one, and then activating bonds between the current spin 
and other spins located to its right by drawing a random 
number, scaling it, and selecting the bond indices from a 
look-up table containing the Co(j) coefficients at energy 
E. Once each spin has been considered as a current spin, 
a cluster multiple labeling technique can eventually be 
used to identify every set of spins actually belonging to 
the same cluster |57| . 



IV. NUMERICAL TESTS OF ALGORITHM 
PERFORMANCE 

In this section, we address the performance of our al- 
gorithm in terms of dynamical behavior. Since our work 
focuses mainly on long-range spin models, we decided to 
perform intensive numerical tests on the one-dimensional 
g-state Potts chain with LR interactions l/\i — j\ 1+a de- 
caying as a power law of the distance between spins. 
The rich phase diagram of this model, and the fact that 
several numerical studies have been carried out on this 
model in the recent past, makes it a perfect test-case. For 
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the sake of comparison with other numerical methods, 
and in order to ensure that our algorithm did not produce 
systematic errors, we also performed several tests on the 
two-dimensional model with nearest-neighbor (NN) in- 
teractions, for which exact results arc known (see 59, 60] ; 
also references in ) . Both models are known to exhibit 
a first-order transition for an appropriate set of parame- 
ters, namely, q > 4 for the NN model [62], and a < a c (q) 
for the LR one, with for instance cr(3) = 0.72(1) 0. We 
chose a set of parameters that would allow us to observe 
both weak and strong first-order transitions, and con- 
centrated on several indicators of performance, reliabil- 
ity, and scalability: these include tunneling, equilibrium 
and effective autocorrelation times, and mean acceptance 
rates. These indicators inform us on the efficiency with 
which the Markov chain reaches the equilibrium distri- 
bution and explores the phase space. They also tell us at 
what rate successive measurements decorrelate from each 
other; hence what amount of resources is needed to ob- 
tain reliable statistics. Overall, they are therefore good 
indicators as to whether CPU resources are efficiently 
utilized or not. As regards scalability, wc also computed 
the dynamical exponents associated with tunneling and 
equilibrium times; these indicate how fast needs in CPU 
time grow with the lattice size. 

All densities of states were calculated by means of the 
Wang-Landau algorithm, whereby, starting from an ini- 
tial guess of the density of states n(E), we update n(E) 
after each visit to energy level E according to the rule 
h\n(E) <— In n(E) + In/, where In / is hereafter termed 
Wang-Landau modification factor. In the case of LR 
models, the unequal spacing of energy levels and the 
existence of energy gaps in the vicinity of the ground 
state required that we introduced a few changes over the 
original version. In particular, using an interpolator for 
h\n{E) turned out to be mandatory in order to compen- 
sate for the finite width of histogram bins — as would 
also be required for models having a continuous sym- 
metry; indeed, we observed that using large bins tends 
to strongly reduce the acceptance rate if no interpola- 
tor is used. Bezier splines provide good interpolators, 
although a linear interpolation with a slope given by the 
microcanonical temperature (3(E) also proved to be par- 
ticularly efficient whenever this last quantity was made 
available by other means, e.g., the transition matrix. 

For small and medium lattice sizes, we systemati- 
cally performed all simulations twice, first with standard 
single-spin updates (SSU) and then with our method em- 
bedding cluster updates (CU). We give an estimate of the 
error in the density of states obtained from both types of 
update schemes. For the largest lattice sizes we studied, 
however, the SSU implementation simply turned out to 
be impracticable, due to either exceedingly high tunnel- 
ing times, and - for LR models - excessive CPU demands, 
and we present results for the CU algorithm only. 
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FIG. 1: Mean acceptance rate as a function of the energy per 
spin for the six-state long-range Potts chain with a = 0.7, 
and L = 1024 spins. The dashed line shows the estimated 
inverse microcanonical temperature. The vertical dotted lines 
indicate the position of the histogram peaks corresponding to 
the ordered and disordered phases. 

A. Phase space exploration and mean acceptance 
rates 

As opposed to the (canonical) Swendsen-Wang clus- 
ter algorithm, the acceptance rate of our algorithm - 
Eq. — is not trivially equal to unity. Still, it is 
tightly related to the efficiency with which the Markov 
chain wanders about the phase space, since a low accep- 
tance rate would lead to very repetitive dynamics. In 
this view, it is instructive to compute an approximate 
analytical expression of this acceptance rate when the 
initial and the final energies E a and E\> differ only by 
a small amount. Writing E\, = E a + e, and carrying 
out a series development to first order in e, one obtains 
v4 flip = min (1, 1 + A(E a )dE), where 

A(E a ) = P'(E a ) ( £ J tf 1+J y F ( y ~ \E a \ ) , 

\b tJ eB Pin^*) J 

with the same notation as in Sec. [H] We wish to ob- 
tain an estimate of the first statistical moments of A(E). 
We hereafter consider the case of a model with nearest- 
neighbor interactions (J = 1), for which we can carry out 
an exact derivation. The last expression simplifies to 

A(E a )=(3'(E a ) ( B ±±*-\E a \), 

where B stands for the total number of bonds and p = 
p(E a ) = e 13 ^^ — 1. At a given energy E a , at most \E a \ 
bonds may be activated. The probability to activate a 
bond is Tr(E a ) = p(E a )/[l+p(E a )\ = 1 - e -^) ; hence 
the probability to create a bond set B containing B bonds 
writes P(B) = (^) [ir{E a )] B [l - ir(E a )]\ E °\- B , which 
may be reexpressed as 
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From there, we can derive the average bond count, (B) 
1 



l-^alrf - - This allows us to rewrite A(E a ) as 



A(E a )=(3'(E a )^±P(B-(B))- 

hence (A(E a )) = 0. The variance of A(E a ) is thus pro- 
portional to the variance of the bond count distribution, 



(B 2 ) (BY 



\E„ 



(i+p) 



which yields 



V ( A ( E a) 2 ) = SA(E a ) = \0'(E a )\< 



\E n 



exp/3(£ )-l 




For a given e > 0, one half of all attempted cluster flips 
thus leads to an acceptance rate which is lower than 1, 
the other half saturating at unity. Assuming a gaussian 
distribution for A(£'), with the standard deviation com- 
puted above (which is valid for large enough lattice sizes), 
the mean acceptance rate is readily obtained from the 
mean value of a gaussian distribution centered at unity 
and truncated above 1, which yields 
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In the case of interactions depending on the distance 
I between spins, one may observe that the average en- 
ergy is related to the average number of bonds of length 
I by -(E) = J2i>o J ( / ) i |f- ( B which shows that 
(A(E)) = also in this case. 

At a first-order transition, (3(E) varies smoothly be- 
tween the energy peaks of the ordered and disordered 
phases, which ensures that A(E) remains small. The 
mean acceptance rate for the six-state LR Potts chain 
with a = 0.7 and L = 1024 spins is sketched in Fig. ^ 
While the acceptance rate is close to 1 inside the range of 
phase coexistence, the variance of A(E) increases when E 
lies outside the range of phase coexistence, and therefore 
leads to a reduction in the acceptance rate. We observe 
that this diminution is less marked at low-energy levels, 
for the energy cost associated with flipping a small num- 
ber of big clusters is lower than that associated with ran- 
domly updating a great deal of small clusters, and E\,—E a 
is consequently lower in the last case. It is worth under- 
lining, however, that the energy range of interest in the 
analysis of first-order phase transitions spans an interval 
which is only moderately larger than that corresponding 
to phase coexistence, the only requirement being that 
metastability plateaus Q and histogram peaks must be 
clearly visible. As a result, the fact that the mean accep- 
tance rate for cluster flips remains well above 90% inside 
this range of energy represents already an improvement 
of a factor 3 with respect to the standard multicanonical 
approach, where we obtained acceptance rates oscillating 
around 30%. 



FIG. 2: Tunneling times for the long-range Potts chain with 
q — 3, a — 0.4 (dashed lines) and 0.6 (dotted lines), and 
q = 6, a — 0.7 (solid lines). Triangles refer to the SSU imple- 
mentation, while squares indicates estimates for our method 
(CU). Dynamic exponents z were determined from a fit to the 
power law r e ~ L z . 



TABLE I: Dynamic exponents z for the g-state Potts 
chain with power-law decaying interactions (a) and its two- 
dimensional counterpart with nearest-neighbor interactions 
(b). z(SSU) and z(CU) refer to single-spin and cluster up- 
dates respectively, while z mu Bo and z mu cius make reference 
to the multibond method [^3 and Rummukainen's multi- 
microcanonical cluster method 39] applied to the NN model. 



q 


a 


z(SSU) 


z(CU) 




6 a 


0.7 


1.35(3) 


1.05(1) 






0.6 


1.48(2) 


1.11(1) 




3 a 


0.4 


1.13(2) 


0.89(1) 




7 b 




2.60(4) 


1.82(2) 


1.84 1.82(3) 


10 b 




2.87(4) 


2.23(1) 


2.1 



B. Dynamic properties 

Where performance measurements at first-order tran- 
sitions are concerned, tunneling times have thus far been 
regarded as one of the most meaningful measurement pa- 
rameters [12, E01 E3 • They are defined as one half of the 
average number of MCS needed for the walk to travel 
from one peak of the energy histogram to the other - 
where peaks are defined with respect to the finite-size 
transition temperature - and turn out to represent a 
fairly good indicator of the interval between roughly in- 
dependent samples. 

Results for the LR chain with q = 3 and 6 are shown 
in Fig. |21 Dynamic exponents z were determined from 
a fit to the power law r e ~ L z , and are summarized 
in Table [I] We can witness a substantial reduction for 
both the LR and the NN models, with exponents close to 
and sometimes even below the ideal random-walk value 
z = D. As regards the NN model, our values compare 
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extremel y w ell with those obtained with the multibond 
method |32| and with Rummukainen's hybrid-like two- 
step algorithm [2j|, although these approaches and ours 
differ markedly in the way clusters are constructed. 

It should be mentioned that the distance Ed — E D the 
random walker must travel, i.e., the energy gap between 
the peaks of the histogram, does not scale linearly with 
the number of spins. This feature is especially appar- 
ent for long-range interactions, where Ed — E Q grows all 
the more faster with increasing lattice size that a comes 
closer to 0. As a result, the power law r e ~ L z yields 
dynamical exponents which are underestimated with re- 
spect to the value given by a power law of the form 
r e ~ (Ed — E ) z (up to a dimensional factor 2 for the 
NN model). For instance, we would obtain z = 1.40(3) 
instead of z = 1.35(3) for q = 6 and a = 0.7, and 
z = 1.10(1) instead of z — 1.05(1). Where the perfor- 
mance in terms of CPU demands is concerned (and in 
particular if one is interested in how it grows with the 
size of the system), we think however that the traditional 
definition r e ~ L z is more meaningful. 

While tunneling times represent a practical way to es- 
timate the efficiency with which the random walker drifts 
along the energy landscape, they are subject to two limi- 
tations. First, they cannot be properly defined in the case 
of second-order phase transitions, since the histogram of 
the energy does no longer exhibit two peaks. Second, 
there is no direct connection between tunneling times 
and autocorrelation times, which makes it difficult to es- 
timate the optimum interval between measurements that 
will yield perfectly uncorrelated data, and thus minimum 
statistical error in estimates of thermodynamic data. It 
is worth mentioning here that computing integrated au- 
tocorrelation times naively from the set of measurements, 
i.e., just as is usually done in the canonical case, simply 
makes no sense when simulating in the multicanonical en- 
semble, because the quantities we are interested in are, in 
the first place, reweighted averages of thermodynamical 
data Hj. 

Therefore, alternate definitions have been proposed, 
which try to circumvent these limitations. On e ap proach 
is to compute the so-called round-trip times |64| . which 
are computed from the number of MCS needed to get 
across the whole energy axis, that is, from the ground 
state to the upper energy level. Although round-trip 
times may be determined for any order of phase tran- 
sition, they present unfortunately no more connection 
with statistical errors than do tunneling times. On the 
contrary, multicanonical effective autocorrelation times, 
which were first introduced in the framework of the multi- 
bond algorithm |3^| , offer a direct comparison with expo- 
nential or integrated autocorrelation times of traditional 
use in canonical simulations. Mimicking the canonical 
case, the effective autocorrelation time r e g can be de- 
fined for any thermodynamic variable by inverting the 
standard error formula e@ = a^T^/N, where N stands 
for the total number of (possibly correlated) measure- 
ments, a 2 denotes the variance of the (reweighted) ther- 
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FIG. 3: Effective autocorrelation time T e g- for q = 6, a = 0.9 
and L = 512 with (a) cluster updates (b) single-spin updates. 
The effective transition temperature defined from the peak of 
the specific heat is T C (C V ) = 0.7163(2). 



modynamic variable 0, e.g., (E 2 *) — (E) 2 , and e 2 , is the 
squared statistical error in the same variable. The error 
may be estimated either from resampling or (jackknifc) 
blocking procedures, or by performing multiple indepen- 
dent runs. Since both the variance and the error depend 
on the reweighting temperature, the previous definition 
obviously yields an effective autocorrelation time which 
also depends on the temperature. 

We now discuss our results for effective autocorrela- 
tion times obtained for the six-state LR Potts chain with 
a = 0.9 and 128 < L < 1024 spins. For this value 
of a, the model exhibits a very weak first-order tran- 
sitions with no clearly visible histogram peaks for sizes 
below L ~ 2000. The choice of medium lattice sizes was 
dictated by the fact that we computed the error from 
multiple independent runs (around 20 runs of 10 6 MCS 
each), which we found a more reliable way of comput- 
ing the statistical error than using a blocking procedure. 
Figure shows the dependence of t b q on the tempera- 
ture for L — 512. For both algorithms, T e g exhibits a 
peak in the vicinity of the effective transition temper- 
ature T C (C V ) = 0.7163(2). As expected, the reduction 
brought about by cluster updates in terms of correlation 
between measurements is marked, especially in the tran- 
sition region, where single-spin update lead to a critical 
slowing down similar to the one encountered in canoni- 
cal simulations. This behavior is consistent with the very 
general observation reported recently in l30j in the frame- 
work of the optimal ensemble, and also in |65| in the con- 
text of equilibration time for multicanonical algorithms 
(see also the next paragraph for more details on this is- 
sue) , whereby the random walker diffuses at a slower pace 
in the critical region. In this respect, cluster updates op- 
timize the diffusive current of the random walker in the 
critical region in much the same way as do the optimal 
ensemble weighting proposed in [3pj . yet with a distinct 
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TABLE II: Effective autocorrelation times at the transition 
temperature defined from the location of the peak of the spe- 
cific heat, for the six-state LR Potts chain with a — 0.9 



128 
256 
512 
1024 

z 



475 

1390 

3960 

12700 

1.6(1) 



155 

310 

635 

1370 

1.0(1) 
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FIG. 4: Fit of effective autocorrelation times r e g- to the power 
law r g g- oc L a for the six-state Potts chain [a = 0.9 and 
L = 512) with (a) cluster updates (b) single-spin updates. 



strategy: in the latter, the error is reduced by allowing 
the walker to spend more time in the critical region than 
in the rest of the energy axis; in our approach, it is the 
decorrelating capability of the move update itself which 
reduces the statistical error in the transition region. As 
is well known, however, cluster updates are especially ef- 
ficient at the percolating threshold, and the reduction in 
terms of correlation is large because bond probabilities 
are governed by the microcanonical temperature. This 
interpretation is clearly underpinned by our investiga- 
tion of the effect of poor estimates of (3(E) on tunneling 
times, presented later in Sec. IVIII Finally, we focus on 
the scaling behavior of autocorrelation times. Table |n] 
reports our results for L ranging from 128 to 1024 spins, 
where t q q is evaluated at the effective transition temper- 
ature determined from the peak of the specific heat. Our 
method gives smaller autocorrelation times already for 
L = 128 spins. From these values, we also determined 
the associated scaling exponents by a fit to the power law 
Tgg oc L z (Fig.HJl, and obtained a highly satisfying value 
of z ~ 1.0(1) with cluster updates. 

We conclude the discussion on the dynamic charac- 
teristics of our algorithm with an investigation of equi- 
librating properties. As opposed to canonical simula- 
tion, estimating equilibrium times has been much less 
common in the context of multicanonical simulations; 
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FIG. 5: Plot of X 2 (t)/ r f° r the six-state Potts chain (a = 
0.9, L = 512) using (a) cluster updates and (b) single-spin 
updates. The regression was carried out over a histogram 
containing 20 bins populated from 1000 runs, all starting in 
ground state configuration but with distinct random seeds. 



the non-linear relaxation function, while very informa- 
tive when the equilibrium distribution is driven by a 
Boltzman weight |66| , is of limited use indeed if the en- 
gendered distribution is flat. Recently, however, an ef- 
ficient procedure aimed at estimating equilibrium times 
for any equilibrium distribution was proposed by Guerra 
and Munoz |65|. This procedure relies on a x 2 regres- 
sion with respect to the (expected) flat equilibrium dis- 
tribution V(E). Starting from the same initial lattice 
configuration, n Markov processes are run with distinct 
random seeds, and at each MC step t, a histogram of 
the energy Vt(E) is filled with the value of the energy 
of each process. Asymptotically, V t (E) should approxi- 
mate the expected flat distribution V(E) oc n(E)w(E). 
In order to estimate the equilibrium time in a more quan- 
titative way, a x 2 (i) deviation of Vt(E) with respect to 
the flat distribution is carried out at each MC step t, 
i.e., X 2 (t) = E E ( v t( E ) ~ nV(E)f/(nV(E)), where the 
sum runs over histogram bins. For large n, and pro- 
vided equilibrium has been reached, the distribution of 
X 2 (t) over to experiments obeys a x 2 law with a num- 
ber r of degrees of freedom given by the number of his- 
togram bins minus one, that is, with a mean equal to 
r and a standard deviation given by yj2r/m. Due to 
the intensive demand in CPU required by this proce- 
dure, we restricted our estimation of equilibrium times 
to the single case q = 6 and a — 0.9. We performed 
n = 1000 Markov processes for sizes between L = 128 
and L = 512, and estimated the equilibrium time from a 
single experiment (that is, to = 1) by simply monitoring 
the time needed for x 2 (t) /r to reach unity and then stay 
within the interval [1 — 2a /r, 1 + 2a /r). As illustrated in 
Fig- El relying on a single experiment leads to quite large 
error bars, yet this is sufficient for our purpose. From 
the graphs of X 2 (t) we read r eq = 4500 ± 500 MCS and 
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TABLE III: Equilibrium times for the six-state LR Potts 
chain with a = 0.9 obtained by monitoring the graph of 
X\t)/r. 



L 


r eq (SSU) 


T eq (CU) 


128 


1700(100) 


800(120) 


256 


6000(750) 


2000(200) 


512 


23000(2000) 


4500(500) 


1024 


101000(8000) 


10000(800) 




L 



FIG. 6: Fit of equilibrium times to the power law r e<3 cx L a 
for the six-state Potts chain (a = 0.9). (a) cluster updates 
and (b) single-spin updates. 

r eq = 23000 ± 2000 MCS for the cluster- and single-spin 
updates respectively; in spite of the large uncertainty, the 
reduction in terms of equilibrium time brought about by 
our method is clearly visible. Results for other lattice 
sizes are summarized in Table IIIII A fit to the power 
law r eq oc L z eq (see Fig. |SJ) yields the scaling exponents 
z eq — 1.96(5) and z eq = 1.21(3) for the single-spin and 
the cluster updates respectively. Here again, we think 
that lower diffusion currents in the latest case account 
for the higher pace at which the random walker reaches 
the equilibrium distribution. 



C. Overall CPU demand for LR models 

We now discuss CPU demand in the case of LR models, 
and concentrate on the gain in CPU resources brought 
about by the optimization schemes proposed in Sec. 11111 
Assuming a decently efficient algorithm implementation, 
this indicator yields a rough account of the real algo- 
rithm complexity, although it should be mentioned that 
it is usually an elaborate task to estimate this quantity 
rigorously, partly because its value hinges heavily on a 
variety of implementation, CPU architecture and com- 
piler dependent properties. We decided to measure CPU 
times over a series of one-hour long simulation runs on 
a handful of distinct CPU architectures, including Intel 
Pentium and Xeon at 2.4 and 3.2GHz. Figure [7| sketches 
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FIG. 7: CPU time per MCS and per spin for the long- 
range Potts chain. Triangles indicate typical CPU times for 
the local- update algorithm (SSU), irrespective of q and a. 
Squares refer to our algorithm (CU) with LR specific opti- 
mizations included; for q = 3 and q — 6, estimates were 
determined by averaging over the indicated a values. 

averages of the CPU (user) time per MCS and per spin, 
where small fluctuations might be attributed to the effect 
of varying CPU cache sizes amidst our clutch of CPU's. 
While for the local-update implementation the demand in 
CPU per spin grows linearly with the number of spin, it is 
roughly constant over a fairly large range of lattice sizes 
in the case of our cluster-update algorithm. Moreover, 
our method already outperforms the local-update scheme 
starting from several hundreds spins, with nonetheless an 
increased footprint for higher q values which is accounted 
for by the correspondingly higher number of FFT's to 
be computed. This, however, clearly demonstrates the 
breakthrough that our method brings about for the study 
of long-range models, paving the way for precise tests of 
finite-size scaling. 

V. TWO-DIMENSIONAL NN POTTS MODEL: 
COMPARISON WITH EXACT RESULTS 

In order to check that our algorithm did not pro- 
duce systematic errors, we computed transition temper- 
atures and interface tensions for the two-dimensional q- 
state Potts model (q = 7, 10) with nearest-neighbor (NN) 
interactions and helical boundary conditions. Results 
regarding the dynamic characteristics of our algorithm 
for this model were reported in Sec. IIVI we will con- 
centrate here on precision matters. For q = 10, we 
obtained T C (L) = 0.70699(5), 0.70491(5), 0.70300(2), 
0.70278(1), 0.70164(1), 0.701328(4) and 0.701249(2) for 
L = 16, 20, 30, 32, 64, 128, and 256, where T c was 
determined from the location of peaks of the specific 
heat. C v was computed directly from the estimated 
density of states, and then refined from an additional 
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production run of length 10 7 MCS. The error was es- 
timated by means of the jackknife method. Following 
standard finite-size scaling theory at first-order phase 
transitions, we collapsed C V (T)/L 2 vs (T — T C )L 2 over 
the five highest lattice sizes and found an infinite size 
temperature T c (oo) = 0.701236(3) in very good agree- 
ment with the exact value 0.70123157 . . . The same pro- 
cedure applied to q — 7 and L = 64, 128, and 256 yielded 
T c (oo) = 0.773059(1) which again matches perfectly the 
exact value 0.7730589 . . . We estimated the interface ten- 
sion (i.e., per unit length of the interface) f s between the 
ordered and disordered phases from the histogram of the 
energy, reweighted at a temperature where energy peaks 
have the same height js^: 2/ s = — L^ 1 \nP m i n , where 
the factor 2 arises from the use of periodic boundary 
conditions. Here, P m in denotes the minimum of the his- 
togram between the two energy peaks; peak heights are 
normalized to unity. We computed f e directly from the 
density of states, and estimated the error from the addi- 
tional production run. In this respect, it should be noted 
that estimating interface tensions directly from the den- 
sity of states generally yields values that lie below those 
computed from histograms collected during production 
runs. Our algorithm allowed us to determine f s with a 
four-digit precision for sizes up to L = 256 and nonethe- 
less rather modest statistics. For the seven-state model, 
we obtained 2f s = 0.0336(6), 0.0294(1), 0.02631(8), and 
0.02384(9) for L = 32, 64, 128, and 256; a linear fit 
of the form X ~ £(oo) + c/L [68| performed over the 
three largest sizes (i.e., for L above the disordered phase 
correlation length £ ~ 48 [6(]]) yielded the infinite size 
value 0.02230(11), still above the exact value 0.020792, 
yet closer to it than estimates reported in several previ- 
ous studies [UllliJ. 



VI. LR POTTS CHAIN: ERROR ESTIMATES IN 
THERMODYNAMIC AL DATA 



In this section, we discuss the precision of our results 
for the g-state Potts chain with algebraically decaying in- 
teractions, i.e., J(r) = l/r 1+a . Our purpose is twofold. 
First, we estimate the error in the density of states n(E) 
obtained from the Wang-Landau algorithm, so that we 
can obtain a better insight into the benefit of our method 
with regards to the iterative calculation of n{E). Second, 
we determine confidence intervals on reweighted averages 
computed from an additional production run. Since com- 
puting thermodynamic quantities from a production run 
does not require that the histogram be perfectly flat, nor 
that the estimate of the density of states be perfectly ac- 
curate, this reduces to estimating the gain in precision 
brought about by lower autocorrelation times. 



A. Statistical error of the density of states 



In order to compare the error in the density of states 
produced by the single-spin update implementation and 
our method, we performed for each method a series of 
12 independent simulations with the Wang-Landau algo- 
rithm, all starting with the same initial guess of the den- 
sity of states. The model parameters were set to q = 6, 
a = 0.9 and L = 512. This choice of parameters guaran- 
tees that, in spite of the modest lattice size we consider, 
autocorrelation times differ by a sufficient amount be- 
tween the single-spin updates method and our method, 
so that the benefit may be clearly interpreted in terms of 
decorrelating capabilities. The initial guess of S(E) was 
scaled up from an estimate obtained at L = 256, and 
the updating factor of the Wang-Landau algorithm was 
initially set to In / = 5. We did not make use of all im- 
provements to the original Wang-Landau algorithm, as 
proposed by Zhou and Bhatt in [34J, since these would 
have partly overshadowed the gain produced solely by 
lower autocorrelation times. Indeed, we mainly focused 
on the systematic error (rather than the whole statisti- 
cal error) that may show up during the first iterations. 
It was shown in [34| that this systematic error results 
from the combination of a large In / coefficient with the 
presence of strong correlations between adjacent binning. 
We thus simply relied on the original histogram flatness 
criterion to switch from one iteration to another, and di- 
vided In / by the same amount (namely, 5) after each 
iteration which passed the flatness check. We found out, 
however, that using the criterion in [34[ instead, that is, 
averaging n(E) on multiple independent runs after each 
iteration, and switching to the next iteration only after 
a given number of entries was recorded in the histogram 
(see Eq. (12) in HJ), led to markedly lower statistical 
errors. As illustrated in Fig. [SJ the statistical error in 
the density of states is clearly improved by our method. 
In particular, cluster updates lead to a spread of the er- 
ror over the whole energy axis. In this respect, and as 
already mentioned in Sec. IIVI the lower diffusion rates 
associated with collective updates in the critical energy 
region offer a clear benefit. As expected from the ar- 
guments of Zhou and Bhatt, the reduction is also more 
marked for In/ = 0.04 than for In/ = 10 -7 , and the 
systematic error brought about by correlations between 
successive binning is indeed partly tamed by a lower 
Wang-Landau modification factor. Finally, we show in 
Fig. El the resulting statistical error in the specific heat, 
since thermodynamical averages are the relevant quanti- 
ties in the first place. C v was computed directly from the 
estimated density of states n(E), i.e., according to the 
formula C v (kT) = {(E 2 ) kT - (E) 2 kT )/(L kT 2 ), where 

(E n ) = (£ s £M£)e~ B/feT )/(£ B ™(£) e_E/feT )- For 
long-range models, energy levels are not equally spaced, 
and it should be noted that too large histogram bins may 
cause a systematic deviation of the averages as well. We 
paid attention to this by comparing our results for several 
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FIG. 8: Statistical error in the density of states of the six- 
state Potts chain for two distinct modification factors In/ 
of the Wang-Landau algorithm. The statistical errors were 
obtained from 12 independent runs. The parameters of the 
model are a = 0.9 and L = 512. (a) and (b) correspond to 
our method and the local-update algorithm respectively. 
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FIG. 10: Specific heat for the three-state Potts chain with 
a = 0.5 as obtained with our method. 



B. Transition temperatures, Binder cumulants and 
interface free energies 
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FIG. 9: Graph of the specific heat for the six-state Potts 
chain (a — 0.9 and L = 512) obtained directly from the final 
estimate of the density of states with (a) our method and (b) 
the local-update algorithm. The inset shows the relative error 
en(C v )/C v . 



bin widths, and made sure that the systematic deviation 
engendered was always lower than the statistical error 
itself. As shown in the inset of Fig. [51 the accuracy of 
the estimation of C v is larger by nearly an order of mag- 
nitude at the transition temperature. Incidentally, we 
observe that this is comparable to the gain in terms of 
autocorrelation times, as already presented in Fig. [3] 



We now discuss some of our results for the three-state 
Potts chain, for which we performed extensive simula- 
tions for sizes ranging from L — 128 to L — 65536. As 
opposed to higher values of q, there exists indeed a large 
collection of numerical studies for q — 3, so that compar- 
ison with previous estimates is easier. Table Hvl reports 
our values for transition temperatures and peaks of re- 
sponse functions for a — 0.5. Both the specific heat 
C v and the susceptibility \ were computed from pro- 
duction runs whose length varied between 10 6 and 10 8 
MCS depending on the lattice size, and error bars were 
computed by means of the jackknife blocking method. 
We performed these production runs twice, first using 
single-spin updates, and then using our method, yet in 
both cases with the same estimate of the density of states. 
Figure shows the graph of C v as obtained with clus- 
ter updates. We mention that, for L > 4096, the local- 
update implementation was simply intractable as a result 
of excessive computation times. For all sizes, our results 
match within error bars for both methods, and it should 
be noticed that, for the two largest sizes, we obtain esti- 
mates of finite-size transition temperatures accurate up 
to the fifth digit. 

In order to determine infinite-size transition temper- 
atures, we performed a fit of the finite-size tempera- 
tures T C (L) reported in Table IIVI to the power law 
T C (L) - T c (oo) = a/L. This is illustrated in Fig. El 
where for the sake of clarity, only data obtained using 
cluster updates are reported. An important effect we no- 
ticed is the presence of a crossover around L = 16384 
for T C (C V ), where the finite-size transition temperature 
reaches a maximum; a similar behavior can be witnessed 
for T c (x) with a maximum occuring around L = 4096. 
In view of this, we first performed a fit over lattice sizes 



14 



TABLE IV: Estimates of peaks of the specific heat C v and the 
susceptibility x> an d corresponding effective transition tem- 
peratures for the three-state LR Potts chain with a — 0.5. 
Error calculations were carried out by means of the jackknife 
method applied to a single production run. The number of 
MCS per production run is the same for both methods, yet 
varies between 10 6 and 10 7 from the smaller to the larger 
lattice sizes. 



L 


T C (C V ) 




/"ITTLCLX 






(CU) 


(SSU) 


(CU) 


(SSU) 


128 


1.6450(18) 


1.645(3) 


3.55(2) 


3.55(3) 


256 


1.6607(2) 


1.6607(13) 


4.86(2) 


4.88(5) 


512 


1.6741(9) 


1.675(1) 


6.54(3) 


6.47(6) 


1024 


1.6815(2) 


1.6815(17) 


9.14(8) 


9.10(24) 


2048 


1.6856(3) 


1.685(1) 


13.63(15) 


13.73(80) 


4096 


1.68742(9) 


1.6875(10) 


22.21(34) 


21.9(2.2) 


8192 


1.68801(7) 




40.28(44) 




16384 


1.688031(34) 




79.46(35) 




32768 


1.687851(12) 




164.1(4) 




65536 


1.687749(09) 




332.8(6) 






x c\XJ 




max 

X 






(CU) 


(SSU) 


(CU) 


(SSU) 


128 


1.6793(14) 


1.679(4) 


3.44(3) 


3.46(3) 


256 


1.6837(2) 


1.6837(15) 


6.09(3) 


6.13(5) 


512 


1.6864(8) 


1.6877(15) 


10.81(7) 


10.86(13) 


1024 


1.6882(3) 


1.688(2) 


19.73(28) 


19.8(5) 


2048 


1.6887(2) 


1.6882(15) 


37.6(5) 


37.5(1.8) 


4096 


1.68869(9) 


1.6887(11) 


75.4(1.2) 


74.6(6.7) 


8192 


1.68842(7) 




165.2(1.8) 




16384 


1.688148(35) 




369.8(1.1) 




32768 


1.687870(20) 




827(2) 




65536 


1.687773(16) 




1754(3) 





ranging from L = 256 to L = 4096: for this range of 
lattice sizes, data points lie neatly on a straight line 
within error bars for both T c (x) and T C (C V ). As regards 
the peaks of the specific heat, a fit performed over the 
range L — [256 — 8192] yielded the same estimate of the 
infinite size temperature, which seems to indicate that 
L = 8192 still belongs to the region where a linear fit is 
reliable. Infinite size temperatures are reported in Ta- 
ble [V] the temperatures computed from both methods 
compare very well with each other and with the value of 
1.691(3) reported in using a multicanonical approach 
and medium lattice sizes. The value obtained from a 
previous MC study based on a canonical version of the 
Luijten-Blote algorithm |6^| lies clearly above our esti- 
mate, while estimates obtained from a transfer matrix 
method 1701 and a real-space renormalization group ap- 
proach [71j fall markedly below our values. Finally, the 
best estimate determined so far (to the best of our knowl- 
edge) with a numerical approach, namely, the value of 
T c = 1.68542 obtained in 72] with the cluster mean- 
field method, lies slightly below ours. The occurence of 



TABLE V: Infinite size transition temperatures computed 
from a fit of finite-size temperatures to a power law of the 
form T C (L) — T c (oo) — a/L. The first column indicates the 
range of lattice sizes that were used in the fit. Results from 
previous studies are shown for comparison in the last column: 
Ref. MC study based on Berg's multicanonical method and 
sizes up to L — 400; Ref. [TJ cluster mean-field method; Ref. 
j pFojl transfer matrix method; Ref. [6Sj MC study based on 
the Lujiten-Blote cluster algorithm and sizes up to L — 3000; 
Ref. UM real -space renormalization group approach. 



[Li-Lz] T C (C V ) 

(CU) (SSU) 
256-4096 1.6889(1) 1.6888(8) 

256-8192 1.6889(1) 
16384-65536 1.68764(1) 

Tc(x) 

(CU) (SSU) 
256-4096 1.6893(1) 1.6892(6) 

16384-65536 1.68765(2) 

Ref. [7] 1.691(3) 
Ref. [72] 1.68542 
Ref. [70] 1.664 
Ref. [69] 1.72 
Ref. [71] 1.41 



a crossover for both response functions led us to perform 
a distinct linear fit restricted to the three largest lattice 
sizes (see the inset of Fig.lTTland results in Table 0): this 
yielded T c (oo) = 1.68764(1) and T c (oo) = 1.68765(2) for 
T C {C V ) and T c (x) respectively. In terms of the number 
of digits of precision, these estimates are comparable to 
the value reported in although they still lie above 
it by around 0.13%. This is all the more surprising that, 
even if the accuracy of the cluster mean-field method de- 
creases with increasing a (with a < 0.5 being quoted by 
the authors in as the interval where the method pro- 
duces its most accurate estimates), this method always 
produces — by construction — upper bounds for transi- 
tion temperatures, irrespective of the size of the clusters. 
Noteworthy enough, a similar discrepancy was reported 
by the authors of an MC study of the LR Ising model 
based on the Luijten-Blote algorithm 73], with the clus- 
ter mean-field approach [jil yielding values underesti- 
mated by 0.8% for a = 0.5. Since the authors in [72ll74| 
performed an extrapolation procedure to compute the 
infinite-size estimate from temperatures obtained at fi- 
nite cluster size, the discrepancy may thus be attributed 
to their specific extrapolation procedure rather than to 
the method used to produce finite-size estimates. 

In order to shed light on the presence of a crossover, 
we compared the lattice size at which the crossover oc- 
curs with two characteristic lengths. One such length 
is the lattice size L co where the peak of the re- 
duced Binder cumulant of the energy, namely, U™ ax = 

max((_E 4 ) / (E 2 } 2 ), experiences a minimum. At a 
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FIG. 11: Solid lines show linear fits of finite-size transition 
temperatures vs 1/L for the three-state LR Potts chain with 
a = 0.5, performed over sizes ranging from L = 256 to 
L — 4096. T C (C V ) and T c (x) are defined from the peaks 
of the specific heat C v and the susceptibility \ respectively; 
all data were obtained using cluster updates (CU). The in- 
set shows a detailed view at larger lattice sizes, along with a 
linear fit carried out over the three largest sizes L — 16384, 
L — 32768 and L = 65536. Error bars are smaller than 
the size of symbols, except where explicitely indicated. The 
horizontal arrow shows the infinite-size estimate T c = 1.68542 
obtained by Monroe using the cluster mean-field method |7^. 
while the vertical dashed error bar refers to the infinite-size 
estimate T c — 1.691(3) obtained in 0- Finally, L co indicates 
the lattice size at which the graph of the peaks of Ul vs the 
lattice size changes slope (see Fig. 1121 . and £ refers to our 
estimate of the correlation length. 



1.22 




0.006 



0.008 



FIG. 12: Peak of the cumulant of the energy Ul = 
(i? 4 ) / (-E 2 ) as a function of the lattice size for the three- 
state LR Potts chain with a — 0.2 and a = 0.5. The inset 
shows a magnification near the origin for a = 0.5; L co indi- 
cates the approximate lattice size at which the curve changes 
slope, crossing over to a first-order transition behavior. 



second-order phase transition, U L nax tends trivially to 
1 in the thermodynamic limit, while it tends to a dis- 
tinct value if the transition is of the first order. As il- 
lustrated in Fig. El the graph of JJ L nax with respect to 
the lattice size is clearly characteristic of a first-order 
transition, with an infinite-size value lying close to 1.033. 
A minimum occurs at L co ~ 8192 and — as shown 
in the inset of Fig. ^] — the crossover occurs slightly 
above L co for T C (C V ), and slightly below for T c (x)- For 
a = 0.2, the same correlation is witnessed by our results, 
with the change of slope of \J™ ax taking place around 
L co = 2048, and a change of behavior for T C (C V ) occur- 
ing near L — 4096. Another characteristic length is the 
correlation length of the disordered phase, which we com- 
pute in a subsequent part of this section: for a = 0.5, our 
estimate of £d ~ 5000 lies, here again, very close to the 
change of behavior of T C (C V ) and T c (x) (see the inset of 
Fig. HD); for cr = 0.2, £ d ~ 600, and T C (C V ) and T C ( X ) 
depart from the straight line at L ~ 3000 and L ~ 1000 
respectively. For both values of a, we thus observe the 
same qualitative behavior, namely, the crossover occurs 
at sizes slightly above L co and £. We think that this 
may be attributed for one part to the fact that, as long 
as the lattice size is smaller than the correlation length, 
the finite-size scaling behavior of the model is that of 
a continuous transition, whereas it is first-order-like at 
larger sizes. Another important finite-size effect may be 
specifically linked to the long-range nature of the interac- 
tion: in 7], it was suggested that, irrespective of the type 
of periodic boundary conditions implemented, the finite 
size of the lattice shifts the (effective) decay parameter 
a towards the mean-field regime, in a way that is more 
pronounced at small lattice size; this effect may thus com- 
pete with the previous effect in a non-trivial manner. It 
is therefore apparent that the power law used in our fits 
is not sufficient to model the finite-size scaling behav- 
ior in an accurate way over the entire range of lattice 
sizes; the derivation of an appropriate finite-size scaling 
law including corrections to scaling is left, however, to a 
subsequent work. 

We also note in passing that, for a = 0.5, relying on 
the Binder cumulant to assess the first-order nature of 
the transition requires simulating the system up to sizes 
that are far beyond the capabilities of single-spin update 
implementations. In particular, performing a power-law 
fit of Ul restricted to sizes below L ~ 3000 would yield 
underestimated values. Our results in Fig. 1121 show that 
the infinite size value lies around 1.033, and thus that the 
transition is stronger than suggested for instance in |69| . 

Although a precise determination of correlation lengths 
for the LR Potts chain is beyond the scope of this work, 
we tried to obtain rough estimates of them from the 
finite-size scaling behavior of the interface free energy F s . 
As in Sec. we computed a histogram N(T, E) of the 
energy reweighted at the pseudo-transition temperature 
T eq h where both peaks of the histogram have equal height 
(see Fig 11311 . After normalization of the peak heights to 
unity, 2F S is given by — lnP m „, where P m in stands for 
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FIG. 13: Graph of the free energy F(T, E) = — In N(T, E) 
for the three-state LR Potts chain with a = 0.5. N(T, E) is 
the histogram of the energy reweighted at a pseudo-transition 
temperature T eq h where both peaks have the same height. For 
the four lattice sizes shown here, lattice configurations cor- 
responding to phase coexistence are suppressed by a factor 
ranging from 0.1 to 10 -6 with respect to pure phase config- 
urations; for the three largest sizes, we note that a canonical 
simulation is clearly intractable. 
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FIG. 14: Fit of the interface free energy F s to a power law 
of the lattice size for the three-state LR Potts chain with 
a = 0.5. All estimates of F s were obtained with our method. 



the minimum of the histogram between the two energy 
peaks. With this definition, which is analog to that used 
in models with nearest-neighbor interactions, 2F S corre- 
sponds to the free energy cost associated — when peri- 
odic boundary conditions are used — with the creation of 
mixed ordered-disordered configurations from pure phase 
configurations. As we will witness, there is strong evi- 
dence that the dimension of the interface is no longer an 
integer. By fitting the interface free energy to the power 
law F s oc L a , we obtain a very good fit for sizes ranging 
from L = 256 to L = 65536, yielding a = 0.91(2) and 



2F S /L a = 0.0004 in the thermodynamic limit. This is 
illustrated in Fig. HI For <r = 0.2, we obtain a = 0.74(3) 
from a fit performed over sizes ranging from L = 512 to 
L = 8192. In view of the expected behavior for nearest- 
neighbor interactions, namely, F s scales to leading order 
as a power of the lattice size with an exponent given by 
the dimension of the interface [7{| , this suggests that the 
effective dimension of the interface lies between and 1 
for long-range chains. This assumption is also supported 
by the fact that the fits of F s / L in j6^ exhibit important 
finite-size corrections, while our fit with a non-integer ex- 
ponent does not suggest such corrections. 



Finally, we can estimate the correlation len gth by 
transposing the argument that was proposed in [59j to 
relate the interface tension f s of the nearest-neighbor 
Potts model to the correlation length ^ of its disordered 
phase. This argument relies on two ingredients: (i) the 
correspondance between the order-order interface tension 
and the correlation length of the disordered phase, which 
results from duality; (ii) the assumption of complete wet- 
ting, which implies a simple relation between the order- 
order and order-disorder interface tensions, where the lat- 
ter is the quantity f s we measure from the reweighted his- 
togram of the energy. The second assumption is rigorous 
if q is large enough, although the authors in [5^ suggest 
that it should stay valid for all q for which the transition 
is of the first order. The argument leads to the rela- 
tion CJ 1 = 2/ s = 2F S / L for the nearest-neighbor model: 
in view of the finite-size scaling behavior of F s reported 
above for the LR chain, it seems natural to transpose this 
relation by defining the interface tension as f s = F s / L a 
for LR chains, and dimensional analysis thus suggests to 
use the relation £p = 2f s = 2F S /L a in the LR case. 
This yields an estimate of £<j ~ 5000 for the LR chain at 
a = 0.5, a value that seems at least consistent with the 
fact that the change of slope of U™ ax occurs at a lattice 
size slightly above this size (see Fig. 1120 . For a = 0.2, 
the same relation yields £<j ~ 600, and here again this 
estimate is consistent with the behavior of JJ™ ax ; in ad- 
dition, and as expected, £<j decreases with a, i.e., as the 
transition becomes stronger. Although there is no rig- 
orous proof that the ingredients invoked in |59| retain 
their validity in the case of LR interactions, it seems 
that a straighforward transposition to LR interactions 
yields reasonable results, even though the topology of 
the interface between the ordered and disordered phases 
is certainly far more complex than in nearest-neighbor 
models. In addition, we make use of Infinite Image Peri- 
odic Boundary Conditions > so that the factor 2 in the 
definition of the interface tension might be questionable 
in LR models. Altogether our estimates should thus be 
taken as very rough ones. 
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FIG. 15: Microcanonical inverse temperature (3(E) — 
dS(E)/dE computed from the estimated density of states us- 
ing a spline interpolation, for the three-state long-range chain 
with a — 0.4, 0.5, and 0.6 from bottom to top. 



VII. COMBINATION WITH THE TRANSITION 
MATRIX METHOD 

In this section, we examine how our method can be effi- 
ciently combined with the transition matrix method |25[ . 
We show in particular that transition matrices represent 
a very efficient way of estimating the microcanonical tem- 
perature (3(E) used to compute cluster bond probabili- 
ties when nothing is known initially about the density 
of states. We also discuss how the estimated (3(E) can 
then be used as an efficient predictor to speed up the 
convergence towards the ground state during the early 
iterations of the Wang-Landau algorithm. 



A. Efficient estimation of (3(E) and bootstrapping 

As seems obvious from the scheme presented in Sec.lTTl 
one of the basic requirements of our algorithm is to have 
an estimate of (3(E) at our disposal over the whole en- 
ergy axis in order to compute cluster bond probabilities. 
One rather simple way of estimating (3(E) is to com- 
pute it from the current estimate of the density of states 
n(E) using a unitc-diffcrence scheme, i.e., in real-time 
in the course of the iteration scheme. This is the most 
tractable approach if one decides to rely solely on Wang- 
Landau's algorithm to estimate n(E). During early iter- 
ations, however, the estimate of n(E) is somewhat rough 
and it is necessary to resort to a spline interpolation in 
order to obtain a sufficiently smooth estimate of (3(E). 
Since the unequal spacing of energy levels in long-range 
models renders an interpolation scheme for n(E) abso- 
lutely mandatory 0, (3(E) is already available to us for 
free. Figure shows estimates obtained with this ap- 
proach for the three-state long-range chain with various 
interaction ranges, computed after ten iteration steps of 
10000 measurements each. We note in passing that the 
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FIG. 16: Mean acceptance rate as a function of the energy 
per spin for the six-state long-range Potts chain with a = 0.5, 
and L = 512 spins (strong first-order regime) for three distinct 
estimates of (3(E). (a) best estimate, as given by the ultimate 
iteration of the Wang-Landau algorithm; (b) (3(E) scaled by 
0.9; (c) /3(E) scaled by 1.1. 



presence of a clearly visible minimum in the three cases 
results from the first-order nature of the transition. For 
sufficiently short-range interactions, and when no ran- 
dom disorder is present, the microcanonical entropy S(E) 
scales quite gently with the lattice size, and it is also 
perfectly feasible to use the value of (3(E) obtained at a 
smaller lattice size as an initial guess. 

In any case, it is crucial for the performance of our 
algorithm that we should compute (3(E) to sufficient ac- 
curacy. Indeed, we have found that any departure from 
the ideal line results in poorer performance, as illustrated 
in Fig. 1161 The curve (a) in the figure shows the mean 
acceptance rate as a function of the energy for an esti- 
mate of (3(E) obtained after the ultimate Wang-Landau 
iteration and a modification factor In / = 10~ 7 . Curves 
(b) and (c) show the same quantity for microcanoni- 
cal temperatures that were under- and overestimated by 
10%. The poor estimate of (3(E) causes a marked de- 
crease of the acceptance rate in the transition region 
(around E/L ~ —1-5), from around 100% to nearly 40%. 
Tunneling times obviously experience a corresponding in- 
crease, from 243 for the best estimate, to 737 and 1150 
for the under- and overestimated temperatures, respec- 
tively. This can be easily explained, if one considers that 
the efficiency of cluster updates reaches a maximum at 
the percolation threshold. Any departure of the estimate 
of (3(E) from the ideal line results in a shift between the 
temperature at which clusters percolate (which depends 
on (3(E)) and the effective temperature of the system 
(which is given by dS(E)/dE). This behavior has been 
observed in the context of canonical simulations of disor- 
dered systems, e.g., the Random Field Ising model [76| . 
where the presence of randomness depresses the critical 
temperature. In this case, using the (canonical) Simula- 



18 




In this case, T OQ (E — > E 1 ) varies sufficiently smoothly for 
the following approximation scheme to be valid: 



FIG. 17: Symbols show the microcanonical inverse tempera- 
ture /3(E) computed from the transition matrix accumulated 
over 2000 MCS, for the six-state LR model (a = 0.5) contain- 
ing 512 spins. The estimate obtained from an interpolation 
scheme after the ultimate iteration is shown as a solid line for 
comparison. 



tion temperature to compute the bond probabilities sim- 
ply results in a growing shift between the critical temper- 
ature and the percolation threshold as the randomness is 
increased. 

In view of the previously mentioned requirements on 
the estimation of (3(E), it is clear that, if one does not 
have a reliable guess of f3(E) at hand before the simula- 
tion starts, an efficient scheme must be devised in order 
to compute P(E) in the early stage of the Wang-Landau 
algorithm. This is vital at this stage, because the exceed- 
ingly noisy estimate of the density of states makes it more 
likely to obtain under- or over-estimated values for (3(E). 
An efficient approach in this regards relies on transition 
matrices [2^, []^J . This method produces highly precise 
estimates of (3(E), although it has an inherently higher 
cost in term of computer load. The starting point is the 
Broad Histogram equation |53l ITsf : 

nfflT^E -> E') = n(E')T 00 (E' -» E), 

where T oc (E — > E') is the transition matrix ele- 
ment between ener gy l evels E and E' (also denoted as 
(N(a,E' — E)) E in This quantity contains the mi- 

crocanonical average at energy E of the number of po- 
tential single-spin moves from a state a of energy E to 
a state a' of energy E 1 . It is estimated by accumulating 
a double-entry histogram h(E, AE) containing the num- 
ber of potential moves from E to E + AE each time the 
energy level E is visited. Long-range interactions lead to 
energy levels which are irregularly spaced, with in par- 
ticular a few gaps in the vicinity of the ground state 0, 
and it is necessary to choose an axis bin small enough 
to minimize discretization errors, and at the same time 
sufficiently large to contain at least a handful of entries. 



13(E) 



1 T^E^E + AE) 
AE n T^E-tE-AE' 



where the actual estimate is obtained by weighted- 
averaging over several values of AE. As illustrated in 
Fig-Elfor the six-state LR chain, the estimation of (3(E) 
from the transition matrix elements is reliable already 
after 2000 MCS, which roughly corresponds to 50 round- 
trips between the upper and the lower energy range. For 
long-range models, each estimation of the number of po- 
tential moves requires of order L 2D operations (as op- 
posed to L D for nearest-neighbor interactions). However, 
we have shown in Sec. lIIII that a single cluster update can 
demand as little as 0(L D In L D ) operations when long- 
range specific optimizations are carried out; hence esti- 
mation schemes based on transition matrices partly scup- 
per the benefits of these optimizations, and should there- 
fore be employed only as a bootstrap procedure when 
nothing is known yet about the microcanonical temper- 
ature. Conversely, models with nearest-neighbor inter- 
actions do not undergo such a drawback, and make the 
transition matrix approach a perfectly transparent one 
from the viewpoint of algorithm complexity. 



B. 



Efficient predictors for the Wang-Landau 
algorithm 
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FIG. 18: The graph shows the number of MCS needed to 
reach the ground-state (dashed horizontal line) of the six- 
state Potts chain (a = 0.5 and L — 512) for an initially 
unknown density of states, using three distinct schemes: (a) 
and (b) predictor based on (3(E), local- and collective- update 
algorithms respectively; (c) no predictor (S(E) = 0, Vi5), 
local-update algorithm. 

Finally, we discuss how (3(E) can be used as an efficient 
predictor during the early stage of the Wang-Landau 
algorithm when nothing is known about the density of 
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states. In the original implementation of this algorithm, 
we start with S(E) = for all energy levels, and sim- 
ply increment S(E) by the modification factor In / each 
time the corresponding energy level is visited. One of 
the main drawbacks of this approach is that the Markov 
chain tends to wander around a fairly long time in the up- 
per energy range, until eventually enough visits have been 
recorded in the histogram for the system to start explor- 
ing low-e nerg y levels. This point has already been men- 
tioned in |3J] , where it was suggested that starting with 
a good initial guess of S(E) was more efficient in terms of 
the number of histogram entries required to reach the fi- 
nal estimate, than performing a multi-range run with no 
initial guess at all. To circumvent this drawback when no 
initial guess is available, we therefore propose to use f3(E) 
to predict S(E) for energy levels that are visited for the 
first time, and thus for which S(E) is not available (i.e., it 
is set to S(E) = in the original implementation of the 
Wang-Landau algorithm). A linear prediction scheme 
turned out to sufficiently efficient for our purpose. As il- 
lustrated in Fig. 1181 using a predictor brings about a gain 
of three orders of magnitude in the time needed to reach 
the ground state. Our method and the single-spin update 
method lead similar performance, with however a slightly 
better behavior when cluster updates are used. We note 
that the Markov chain stays initially somewhat longer in 
the upper energy range when cluster updates are used, 
since a good estimate of (i{E) is needed to build the clus- 
ters with the correct bond probabilities. We think that 
this approach would prove particularly useful when the 
characteristics of the model makes it impossible to ob- 
tain an initial guess of S(E) from simulations at smaller 
lattice sizes, e.g., in the presence of disorder or when the 
long-range interaction experience a slow decay. 

VIII. CONCLUSION 

In conclusion, we have developed a new Monte Carlo 
method which combines in an efficient and straightfor- 



ward way the benefits of flat histogram algorithms with 
the fast-decorrelating capabilities of cluster updates. It is 
suited for spin models with any number of interaction be- 
tween spins. Our formulation is versatile, and the method 
can be applied to a variety of density of states estimation 
schemes, including the Wang-Landau algorithm, Berg's 
recursion scheme or the transition matrix method. We 
have shown that using the microcanonical temperature 
to compute cluster bond probabilities leads to a dras- 
tic reduction in effective autocorrelation times, tunnel- 
ing times and equilibration times. In the context of the 
Wang-Landau implementation, the reduced correlation 
between successive binning of the energy histogram yields 
a lower error in the estimation of the density of states, 
and as a result more reliable estimates of thermodynamic 
averages. Several schemes for the estimation of the mi- 
crocanonical temperature were proposed, amongst which 
an efficient procedure which harnesses the power of the 
transition matrix method, and allows us to bootstrap 
the algorithm even if nothing is known initially about 
the density of states. Finally, we carefully examined 
the precision of our method in the case of spin models 
with power-law decaying interactions. Here, our method 
proves all the more powerful that it is able to reduce 
the algorithm complexity to that of a short-range model 
having the same number of spins. This allowed us to 
study several finite-size effects at large lattice sizes, oth- 
erwise largely out of reach of conventional local-update 
implementations. In particular, we found out that the 
interface free energy scales perfectly well with a power of 
the lattice size, yet with a non-integer exponent which 
lies between and 1. This, we think, is accounted for 
by the complex topology of the phases in coexistence in 
long-range models. A more detailed study, including a 
deeper insight into the topological properties of the gen- 
erated clusters and the estimation of correlation lengths 
at large lattice sizes, would be very promising. We think 
that our method clearly draws this challenge within com- 
putation range. 
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